-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Baked data for icu_plurals
, icu_timezone
#3462
Conversation
90dd7c8
to
1250ef8
Compare
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This reverts commit 961b6c1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work; need to align on open questions
components/plurals/build.rs
Outdated
// called LICENSE at the top level of the ICU4X source tree | ||
// (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ). | ||
|
||
fn main() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two separate discussions in this thread.
Custom globaldata path: In #2992 I proposed the dual features approach, but I like the cfg
better. A disadvantage of the dual features is that you have the --all-features
problem.
Macros in crates: Unless my assumptions are wrong, I see it as a non-starter to call macros in crates they don't belong to. It makes the crates heavier and much more difficult for DCE. There are a couple solution we discussed previously:
- The crates export a DataProvider implementing the bounds they need, and client crates fork between them manually (fork-by-key DataProvider works if you manually list out every key you need to fork on).
- The globaldata constructors call other globaldata constructors directly. This creates two code paths, one for globaldata and one for custom data, which isn't great, but most constructors are thin so it's probably fine.
/// let pr = PluralRules::try_new_unstable( | ||
/// &icu_testdata::unstable(), | ||
/// let pr = PluralRules::try_new( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussion: I'm not sure yet what we want to put as the primary example in the docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm more and more of the opinion that anything with a provider should be considered a power-user feature. Google and Mozilla need it, but most users will not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not great for sample code to require enabling a feature flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I assume you're opposed to making it a default feature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we come to an agreement that the data lives inside the component crates, then it should probably be a default feature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does that matter? We can also pull in another crate by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you want me to re-review
@@ -0,0 +1,28 @@ | |||
# This file is part of ICU4X. For terms of use, please see the file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I see, you implemented separate crates but distributed them inside the individual components instead of in a centralized location
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's please not block on where in the repo these will live
@@ -53,6 +55,7 @@ serde = ["dep:serde", "zerovec/serde", "icu_locid/serde", "icu_provider/serde"] | |||
datagen = ["serde", "zerovec/databake", "dep:databake"] | |||
experimental = [] | |||
bench = ["serde"] | |||
data = ["dep:icu_plurals_data"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted elsewhere, we need to bikeshed the feature name (not a landing blocker)
components/plurals/data/Cargo.toml
Outdated
# (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ). | ||
[package] | ||
name = "icu_plurals_data" | ||
version = "43.0.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: how about starting at "0.1.0" and incrementing to "0.1.1", "0.1.2", ... for CLDR updates and new data structs and "0.2.0" when there is a breaking data struct change. Then it's clearly a different versioning scheme than the rest of ICU4X.
(Maybe put new data structs also in a new channel, like breaking changes, so that minor releases are reserved for CLDR and we can continue to update old channels if desired)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we bikeshed versioning later?
}, | ||
"locales": { | ||
"CldrSet": [ | ||
"modern" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to discuss the CldrSet to use (doesn't need to block this PR)
icu_plurals
, icu_timezone
icu_plurals
, icu_timezone
icu_plurals
, icu_timezone
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing my nitpicks 😃 This is good to land now and we can continue bikeshedding the details
#2945
This adds data to subcrates (i.e.
<component>/data
). Users can provide custom baked data by setting theICU4X_DATA_DIR
environment variable during compilation. Crates will then fail to compile if the data doesn't contain the required keys.The baked data uses
--locales modern
. I think that's a good default, if clients want other sets of languages they will have to use datagen.While some of the generated files are big, the compressed tarballs for crates.io all fit comfortably within the 10MB limit.
On the API side, I propose adding
try_new[_with_locale[_with_options]]
constructors. These provider-less constructors become the main documentation entry point, with constructors being listed in the order*
,_with_any_provider
,*_with_buffer_provider
,*_unstable
. The provider-constructors refer to the simple constructor's documentation. Example.In locale-agnostic components (calendar, normalizer, locid_transform, properties, timezone, casemapping) we can create infallible constructors, which can usually be
const
(as these don't take arguments, they could also beconst
values 1). Example.Footnotes
My thinking here is to use
const fn
for constructors, andconst
values for what would be free functions, like inicu_properties
. ↩